↜ Back to index Introduction to Numerical Analysis 2

Part b–Lecture 7

In today’s lecture we consider a more interesting sparse matrix. Our goals are to:

initialize a compressed sparse row storage for this matrix
update the CG algorithm to use this matrix
update the preconditioner to use this matrix

Let us recall the sparse matrix considered in Report 1.

We fix a natural number q and set n = q^2. The matrix A will be the n \times n matrix with entries A_{ij} = \begin{cases} 4, & i=j,\\ -1, & \Big(|i - j| = 1 \text{ and } \min(i,j) \mod q \neq 0\Big) \text{ or } |i - j| = q,\\ 0, & \text{otherwise}. \end{cases}

Recall that \min(i,j) \mod q \neq 0 means that \min(i,j) is not divisible by q.

This is how the matrix looks for q = 3: \begin{pmatrix} 4&-1& 0&-1& 0& 0& 0& 0& 0\\ -1& 4&-1& 0&-1& 0& 0& 0& 0\\ 0&-1& 4& 0& 0&-1& 0& 0& 0\\ -1& 0& 0& 4&-1& 0&-1& 0& 0\\ 0&-1& 0&-1& 4&-1& 0&-1& 0\\ 0& 0&-1& 0&-1& 4& 0& 0&-1\\ 0& 0& 0&-1& 0& 0& 4&-1& 0\\ 0& 0& 0& 0&-1& 0&-1& 4&-1\\ 0& 0& 0& 0& 0&-1& 0&-1& 4 \end{pmatrix}

and for q = 4:

Note that every qth element on the diagonals next to the main diagonal is 0.

Exercise 1.

What is the number of nonzero elements of A for a given q?
(Challenge) Make a Fortran program that reads q and initializes the compressed sparse row storage arrays v, rb and ci for the matrix A given above, and prints them.

Check that the result is correct for q=2 and q=3.

Our goal is now to use this matrix in the preconditioned conjugate gradient method that we implemented in Lecture b3.

This is fortunately not very difficult and is similar to what we did in Lecture b5. We only need to replace the calls to mv_mul by calls to sparse_csr_mul presented in the previous lecture, and replace the initialization of a by the initialization of v, rb and ci from Exercise 1 above, and modify the function precond to use the compressed sparse row representation of the matrix.

Preconditioner

We used the SSOR preconditioner in Lecture b3.

This is the actual algorithm that we implemented in function precond in Lecture b3:

For i = 1, 2, \ldots, n do:
- z_i \leftarrow \frac1{A_{ii}}\big((2- \omega) r_i - \omega \sum_{j=1}^{i - 1} A_{ij} z_j\big)
For i = n, n-1, \ldots, 1 do:
- z_i \leftarrow \omega z_i - \frac{\omega}{A_{ii}} \sum_{j=i+1}^n A_{ij}z_j

Exercise 2. Check that the implementation of precond in Lecture 4 matches the algorithm above.

What we need is an implementation of the precond function for a matrix represented in the compressed sparse row storage. This makes the implementation a bit more tricky.

Preconditioned CG method with compressed row storage

Here is the full preconditioned CG method program for the matrix considered in today’s lecture:

implicit none
real, parameter :: eps = 1e-10
real, dimension(:), allocatable :: b, x, r, p, z, Ap, v
integer, dimension(:), allocatable :: rb, ci
real rho, rho_old, alpha, om
integer k, i, n, q, t

read *, q, om

! Initialize CSR storage for A
n = q**2
k = 5 * q**2 - 4 * q

allocate(v(k))
allocate(rb(n + 1))
allocate(ci(k))

! Counter of where to put next entry in v and ci.
t = 1
! loop over rows
do i = 1, n
    rb(i) = t

    ! consider the 5 possible nonzero entries in this row
    if (i - q >= 1) then
        v(t) = -1
        ci(t) = i - q
        t = t + 1
    end if

    if (i - 1 >= 1 .and. mod(i - 1, q) /= 0) then
        v(t) = -1
        ci(t) = i - 1
        t = t + 1
    end if

    v(t) = 4
    ci(t) = i
    t = t + 1

    if (i + 1 <= n .and. mod(i, q) /= 0) then
        v(t) = -1
        ci(t) = i + 1
        t = t + 1
    end if

    if (i + q <= n) then
        v(t) = -1
        ci(t) = i + q
        t = t + 1
    end if
end do
rb(n + 1) = k + 1


! preconditioned CG algorithm
allocate(b(n))
allocate(x(n))
allocate(r(n))
allocate(p(n))
allocate(z(n))
allocate(Ap(n))

b = 1. / (q + 1)**2

x = 0
r = b
z = precond_csr(v, rb, ci, r, om)
rho = inner(r, z)
p = z

do k = 1, n
    Ap = sparse_csr_mul(v, rb, ci, p)
    alpha = rho  / inner(Ap, p)
    x = x + alpha * p
    r = r - alpha * Ap
    z = precond_csr(v, rb, ci, r, om)
    rho_old = rho
    rho = inner(r, z)
    print *, k, sqrt(inner(r, r))
    if (sqrt(inner(r, r)) < eps) exit
    p = z + (rho / rho_old) * p
end do

contains 
    ! SSOR preconditioner: computes z = M⁻¹r where M is the SSOR matrix
    ! for matrix A with parameter ω.
    ! The matrix A is given in a compressed sparse row representation.
    ! Assumes that the matrix has nonzero diagonal elements
    ! and that elements in each row are stored from left to right!
    function precond_csr(v, rb, ci, r, om) result(z)
        real v(:), r(:), z(size(r))
        integer rb(:), ci(:)
        integer i, j
        real s, a, om
        ! iterate over rows
        do i = 1, size(r)
            s = 0
            ! go over the nonzero elems in row i
            do j = rb(i), rb(i+1) - 1
                if (ci(j) == i) then     ! col == row
                    ! we reached the diagonal element
                    a = v(j)
                    exit
                endif
                s = s + v(j) * z(ci(j))
            enddo
            z(i) = ((2 - om) * r(i) - om * s) / a
        enddo 

        ! iterate over rows in the reverse order
        do i = size(r), 1, -1
            s = 0
            ! go over the nonzero elems in row i in reverse order
            do j = rb(i+1) - 1, rb(i), -1
                if (ci(j) == i) then    ! col == row
                    ! we reached the diagonal element
                    a = v(j)
                    exit
                endif
                s = s + v(j) * z(ci(j))
            enddo
            z(i) = om * z(i) - om * s / a
        enddo 
    end

    function inner(x, y)
        real x(:), y(:), inner
        integer i
        if (size(x) /= size(y)) then
            stop 'arrays x and y must have the same size'
        end if

        inner = 0
        do i = 1, size(x)
            inner = inner + x(i) * y(i)
        end do  
    end

    function sparse_csr_mul(v, rb, ci, x) result(y)
        real v(:), x(:), y(size(x))
        integer rb(:), ci(:)
        integer i,j

        do i = 1,size(x)
            y(i) = 0
            do j = rb(i),rb(i+1) - 1
                y(i) = y(i) + v(j) * x(ci(j)) 
            enddo
        enddo
    end
end

Exercise 3. Use b_i = 1/(q + 1)^2, \varepsilon = 10^{-10} in the above program.

Run the program for q = 500 and each \omega = \{1.9, 1.905, 1.91, \ldots, 1.995\} (that is, 20 different \omega’s) and find the number of iterations that the program requires for given \omega. Make a plot of the number of iterations as a function of \omega in gnuplot. Use with linespoints.

Submit the plot with your student ID as the title.